Attaching package: 'janitor'
The following objects are masked from 'package:stats':
chisq.test, fisher.test
library(sf)
Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
library(leaflet)
Warning: package 'leaflet' was built under R version 4.2.3
20.2 Data
We’re loading two dataframes to answer our question.
Washington Homicides | spatial points | location and details about each Washington homicide between 2007 and 2016. Details on the data: https://github.com/washingtonpost/data-homicides
Washington Schools | spatial points | locations of each Washington, D.C. school. https://nces.ed.gov/programs/edge/geographic/schoollocations
Washington D.C. shapefile | shapefile | outline of Washington, D.C.
#### Load dataframe of washington homicides###washington_homicides <-read_rds("assets/data/washington_homicides.rds")#### Load dataframe of washington schools###washington_schools <-read_rds("assets/data/washington_schools.rds")#### Load outline of DC###washington_shapefile <-read_rds("assets/data/washington_shapefile.rds")
Let’s think back to our question: how many homicides occurred near schools?
In the prior lab, we used something called a “spatial join” to tie a school – a geographic point – to a Census tract – a geographic polygon.
We could try something similar here. The code below says: create a new object called schools_homicides by attempting to connect each point in our schools table with each point in our homicides table. If there’s an overlap, return that match to our new table.
Simple feature collection with 0 features and 15 fields
Bounding box: xmin: NA ymin: NA xmax: NA ymax: NA
Geodetic CRS: NAD83
# A tibble: 0 × 16
# ℹ 16 variables: school_name <chr>, address <chr>, city.x <chr>,
# state.x <chr>, zip <chr>, geometry <GEOMETRY [°]>, uid <chr>,
# reported_date <dbl>, victim_last <chr>, victim_first <chr>,
# victim_race <chr>, victim_age <chr>, victim_sex <chr>, city.y <chr>,
# state.y <chr>, disposition <chr>
Notice that it returned zero rows. That’s because for there to be a match, a school point had to be essentially on top of a shooting point. The points are very small, and it’s unlikely there will be an exact match in our data.
We can fix that by making the size of the point bigger, by converting it to a polygon. How do we define “near” for the purpose of our question? Let’s use “within 100 meters” of the centerpoint of the school as our definition.
We’re going to use a new function st_buffer to create a “buffer” around our latitude longitude point. We’re going to change our geometry from a small point to a circle with a radius of 100 meters centered on that point.
Simple feature collection with 239 features and 5 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: -77.10182 ymin: 38.82285 xmax: -76.91885 ymax: 38.98553
Geodetic CRS: NAD83
# A tibble: 239 × 6
school_name address city state zip geometry
* <chr> <chr> <chr> <chr> <chr> <POLYGON [°]>
1 Cesar Chavez PCS for Pub… 3701 H… Wash… DC 20019 ((-76.94809 38.90084, -7…
2 Cesar Chavez Public Char… 3701 H… Wash… DC 20019 ((-76.94809 38.90084, -7…
3 Friendship PCS - Collegi… 4095 M… Wash… DC 20019 ((-76.94558 38.89821, -7…
4 Friendship PCS - Southea… 645 MI… Wash… DC 20032 ((-76.99599 38.84797, -7…
5 Friendship PCS - Technol… 2705 M… Wash… DC 20032 ((-76.99592 38.84915, -7…
6 Friendship PCS - Blow Pi… 725 19… Wash… DC 20002 ((-76.97583 38.89864, -7…
7 Friendship PCS - Blow Pi… 725 19… Wash… DC 20002 ((-76.97583 38.89864, -7…
8 Friendship PCS - Chamber… 1345 P… Wash… DC 20003 ((-76.9878 38.87966, -76…
9 Friendship PCS - Woodrid… 2959 C… Wash… DC 20018 ((-76.96495 38.92914, -7…
10 Friendship PCS - Woodrid… 2959 C… Wash… DC 20018 ((-76.96495 38.92914, -7…
# ℹ 229 more rows
Let’s visualize it. It’s hard to see the difference from this map, but those purple circles represent a 100 meter radius circle around our school point.
Warning: sf layer has inconsistent datum (+proj=longlat +datum=NAD83 +no_defs).
Need '+proj=longlat +datum=WGS84'
Warning: sf layer has inconsistent datum (+proj=longlat +datum=NAD83 +no_defs).
Need '+proj=longlat +datum=WGS84'
We can also see the difference in the geometry column between our buffered dataframe and our point dataframe.
Our washington_schools dataframe is a collection of points.
washington_schools$geometry
Geometry set for 239 features
Geometry type: POINT
Dimension: XY
Bounding box: xmin: -77.10065 ymin: 38.82376 xmax: -76.92001 ymax: 38.98462
Geodetic CRS: NAD83
First 5 geometries:
POINT (-76.94906 38.90134)
POINT (-76.94906 38.90134)
POINT (-76.94631 38.89751)
POINT (-76.997 38.84754)
POINT (-76.99629 38.8483)
Our washington_schools_buffered dataframe is a collection of polygons
washington_schools_buffered$geometry
Geometry set for 239 features
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: -77.10182 ymin: 38.82285 xmax: -76.91885 ymax: 38.98553
Geodetic CRS: NAD83
First 5 geometries:
Simple feature collection with 87 features and 15 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: -77.04171 ymin: 38.82285 xmax: -76.91885 ymax: 38.9614
Geodetic CRS: NAD83
# A tibble: 87 × 16
school_name address city.x state.x zip geometry uid
* <chr> <chr> <chr> <chr> <chr> <POLYGON [°]> <chr>
1 Friendship PCS … 4095 M… Washi… DC 20019 ((-76.94558 38.89821, -7… Was-…
2 Friendship PCS … 4095 M… Washi… DC 20019 ((-76.94558 38.89821, -7… Was-…
3 Friendship PCS … 645 MI… Washi… DC 20032 ((-76.99599 38.84797, -7… Was-…
4 Friendship PCS … 1351 N… Washi… DC 20011 ((-77.03121 38.95965, -7… Was-…
5 Friendship PCS … 645 MI… Washi… DC 20032 ((-76.99599 38.84806, -7… Was-…
6 Moten ES 1565 M… Washi… DC 20020 ((-76.98158 38.8566, -76… Was-…
7 Stanton ES 2701 N… Washi… DC 20020 ((-76.96953 38.85878, -7… Was-…
8 Sousa MS 3650 E… Washi… DC 20019 ((-76.95386 38.88319, -7… Was-…
9 Savoy ES 2400 S… Washi… DC 20020 ((-76.99378 38.86371, -7… Was-…
10 Marie Reed ES 2201 1… Washi… DC 20009 ((-77.0414 38.91974, -77… Was-…
# ℹ 77 more rows
# ℹ 9 more variables: reported_date <dbl>, victim_last <chr>,
# victim_first <chr>, victim_race <chr>, victim_age <chr>, victim_sex <chr>,
# city.y <chr>, state.y <chr>, disposition <chr>
We get 87 rows. This does NOT mean:
We had 87 homicides near schools. Some individual homicides occurred within 100 meters of more than one school.
There were homicides near 87 schools. Some schools had multiple homicides.
To answer our original question, “how many homicides occurred near (within 100 meters) of a DC school?”, we need to distinct our list of schools and count them.
There’s our answer. Out of 239 DC schools, 65 – more than 25 percent – had a homicide occur within 100 meters.
Question #1: Answer this question in English: Write in Elms
Develop two questions that you could answer with a spatial buffer as described in this chapter, outside of the domain of education and crime. Briefly describe the two geographic dataframes you would need to answer each question.